[SPARK-19500] [SQL] Fix off-by-one bug in BytesToBytesMap #16844

davies wants to merge 4 commits into apache:master from davies/off_by_one
Conversation
cc @JoshRosen, @viirya
Test build #72541 has finished for PR 16844 at commit
LGTM
    isDefined = true;

-   if (numKeys > growthThreshold && longArray.size() < MAX_CAPACITY) {
+   if (numKeys >= growthThreshold && longArray.size() < MAX_CAPACITY) {
The re-allocated space might not be used if there is no further insertion. Shall we do growAndRehash at the beginning of append when numKeys == growthThreshold && !isDefined?
Unfortunately, we can't grow at the beginning of append, otherwise the pos will be wrong.
if it fails to grow once (stays at initial capacity).
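To make that concern concrete, here is a minimal, self-contained sketch (not Spark code; the class name and the `pos = hash & (capacity - 1)` slot computation are assumptions that mirror how an open-addressing map like BytesToBytesMap indexes its longArray) of why a pos computed before growing becomes stale:

```java
public class StalePosSketch {
  public static void main(String[] args) {
    int hash = 0xDEADBEEF;                     // some key hashcode
    int capacity = 64;                         // current capacity (power of 2)
    int posBefore = hash & (capacity - 1);     // slot found by lookup() before growing
    int posAfter = hash & (capacity * 2 - 1);  // slot the same key maps to after doubling
    // growAndRehash() re-inserts every key into a bigger array, so the two indices
    // generally differ; a pos computed before the grow would point at the wrong slot.
    System.out.println("pos before grow: " + posBefore + ", after grow: " + posAfter);
  }
}
```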
@viirya Addressed your comment, also fixed another bug (updated PR description).
Test build #72594 has finished for PR 16844 at commit
Test build #3566 has finished for PR 16844 at commit
    try {
      growAndRehash();
    } catch (OutOfMemoryError oom) {
      return false;
Unrelated, but this OutOfMemoryError will not be useful, at least not in YARN mode.
It will simply cause the JVM to exit.
        || !canGrowArray && numKeys > growthThreshold) {
      return false;
    if (numKeys >= growthThreshold) {
      if (longArray.size() / 2 == MAX_CAPACITY) {
This does not look correct as per the documentation of MAX_CAPACITY.
The actual number of keys can be == MAX_CAPACITY (so that the total number of entries in longArray is MAX_CAPACITY * 2).
We grow the array when numKeys >= growthThreshold, and growthThreshold = capacity * 0.5. But we actually allocate capacity * 2 entries for the array.
So numKeys < growthThreshold = capacity * 0.5 < array length = capacity * 2 should hold.
Because numKeys < growthThreshold is always true, if numKeys == MAX_CAPACITY, the capacity would be at least MAX_CAPACITY * 2 and the length of the array would be more than MAX_CAPACITY * 4.
But in allocate, there is an assert that capacity <= MAX_CAPACITY. Those conditions look inconsistent.
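For readers trying to keep the sizes straight, here is a small self-contained sketch of the quantities in this thread, under the assumptions stated above (growthThreshold = capacity * 0.5, longArray holds capacity * 2 longs); the class name is made up, and the MAX_CAPACITY value is quoted from the BytesToBytesMap javadoc as best recalled, not verified here:

```java
public class SizeInvariantsSketch {
  // BytesToBytesMap caps `capacity` (not numKeys) at 1 << 29; treat this exact value
  // as an assumption quoted from memory.
  static final int MAX_CAPACITY = 1 << 29;

  public static void main(String[] args) {
    int capacity = 64;                             // always a power of 2
    long longArraySize = (long) capacity * 2;      // two long entries per slot
    int growthThreshold = (int) (capacity * 0.5);  // grow once numKeys reaches this
    System.out.println("capacity=" + capacity
        + " longArray.size()=" + longArraySize
        + " growthThreshold=" + growthThreshold
        + " MAX_CAPACITY=" + MAX_CAPACITY);
    // Since longArray.size() == capacity * 2, the guard in the diff,
    // longArray.size() / 2 == MAX_CAPACITY, is just `capacity == MAX_CAPACITY`,
    // i.e. the map cannot double again.
    System.out.println("cannot grow further: " + (longArraySize / 2 == MAX_CAPACITY));
  }
}
```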
Also, we need to move the appropriate validation check into growAndRehash() and not here.
There are two reasons it can fail to grow: 1) the current capacity (longArray.size() / 2) has reached MAX_CAPACITY; 2) it can't allocate a new array (OOM).
So I think the check here is correct.
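A simplified, self-contained sketch of those two failure modes, roughly mirroring the shape of the diff above (canGrowArray, growAndRehash, and the simulated OutOfMemoryError are stand-ins for the real BytesToBytesMap members; the exact bookkeeping in the PR may differ):

```java
public class GrowFailureSketch {
  static final int MAX_CAPACITY = 1 << 29;   // assumed cap on capacity, as above
  static boolean canGrowArray = true;
  static long longArraySize = 64 * 2;        // stand-in for longArray.size()

  // Stand-in for BytesToBytesMap#growAndRehash(); here it simulates the memory
  // manager refusing to hand out a larger array.
  static void growAndRehash() {
    throw new OutOfMemoryError("Unable to acquire memory for a larger longArray");
  }

  static void maybeGrow(int numKeys, int growthThreshold) {
    if (numKeys >= growthThreshold) {
      if (longArraySize / 2 == MAX_CAPACITY) {
        canGrowArray = false;                // reason 1: capacity already at MAX_CAPACITY
      } else {
        try {
          growAndRehash();
        } catch (OutOfMemoryError oom) {
          canGrowArray = false;              // reason 2: not enough memory to grow
        }
      }
    }
  }

  public static void main(String[] args) {
    maybeGrow(32, 32);
    System.out.println("canGrowArray = " + canGrowArray);  // false: the OOM path was taken
  }
}
```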
@davies You are right that the check longArray.size() / 2 == MAX_CAPACITY is the upper bound beyond which we can't grow. It is simply confusing to do it outside growAndRehash, which is what threw me off.
Please move the check into growAndRehash() and have it return true in case it could successfully grow the map.
@mridulm Does that mean we should also rename growAndRehash to tryGrowAndRehash? I think that is not necessary.
The invariant in question belongs to growAndRehash(), not append, and as such should live there.
If code evolution causes grow to be invoked from elsewhere, the invariant would have to be duplicated everywhere.
Btw, this is in line with all the other data structures Spark (and other frameworks) have.
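For illustration, a hypothetical sketch of the refactor being suggested here (a grow method that owns the invariant and reports success); this is not what the PR ends up doing, and the fields below are simplified stand-ins for the real internals:

```java
public class BooleanGrowSketch {
  static final int MAX_CAPACITY = 1 << 29;       // assumed cap on capacity
  private long[] longArray = new long[64 * 2];   // stand-in for the real LongArray

  /** Returns true if the map grew; false if capacity is already at MAX_CAPACITY. */
  private boolean growAndRehash() {
    int currentCapacity = longArray.length / 2;
    if (currentCapacity == MAX_CAPACITY) {
      return false;                              // the invariant lives with the grow logic
    }
    longArray = new long[longArray.length * 2];  // re-hashing of existing keys omitted
    return true;
  }

  public static void main(String[] args) {
    BooleanGrowSketch m = new BooleanGrowSketch();
    boolean grew = m.growAndRehash();
    // append() would then simply do: if (!growAndRehash()) canGrowArray = false;
    System.out.println("grew=" + grew + ", new capacity=" + m.longArray.length / 2);
  }
}
```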
    longArray.set(pos * 2 + 1, keyHashcode);
    isDefined = true;

    if (numKeys > growthThreshold && longArray.size() < MAX_CAPACITY) {
This longArray.size() < MAX_CAPACITY looks like the wrong condition.
longArray.size() is the next capacity under the current growth strategy, so it should be longArray.size() <= MAX_CAPACITY.
        || !canGrowArray && numKeys > growthThreshold) {
      return false;
    if (numKeys >= growthThreshold) {
      if (longArray.size() / 2 == MAX_CAPACITY) {
Is MAX_CAPACITY still the maximum number of keys, as per its documentation? If the capacity can be at most longArray.size() / 2 == MAX_CAPACITY, then the actual numKeys should be MAX_CAPACITY / 2, because we need two long array entries per key, right?
@davies is correct; but it is a slightly unintuitive way to write the condition.
val currentSize = longArray.size()
val newSize = currentSize * 2
val currentKeysLen = currentSize / 2
val newKeysLen = currentKeysLen * 2
if (newKeysLen > MAX_CAPACITY) then fail.
That is, if (currentKeysLen == MAX_CAPACITY) then fail // since we allow only powers of 2 for all these values.
That is, if (longArray.size() / 2 == MAX_CAPACITY).
Particularly given its location (in append as opposed to grow), it ends up being a bit more confusing than expected.
If the currentKeysLen above is the number of keys, it never equals currentSize / 2. currentSize / 2 is actually the capacity we want to allocate (though we actually allocate double that for the array).
Once the number of keys reaches growthThreshold (i.e., capacity * 0.5), we either grow the array or fail the append. So the number of keys is always less than or equal to capacity * 0.5, which is currentSize * 0.5 * 0.5.
To clarify, the lengths @davies and I mentioned are not the actual number of keys in the map, but the maximum number of keys possible in the map.
Test build #3571 has finished for PR 16844 at commit
    // then we don't try to grow again if hit the `growthThreshold`.
        || !canGrowArray && numKeys > growthThreshold) {
      return false;
    if (numKeys >= growthThreshold) {
I think we need to grow the array only if isDefined == false.
…ever grow" This reverts commit d9aa208.
Test build #72946 has finished for PR 16844 at commit
retest this please.
Test build #72977 has finished for PR 16844 at commit
    // The map could be reused from last spill (because of no enough memory to grow),
    // then we don't try to grow again if hit the `growthThreshold`.
-       || !canGrowArray && numKeys > growthThreshold) {
+       || !canGrowArray && numKeys >= growthThreshold) {
This change makes sense to me because growthThreshold's Scaladoc says "The map will be expanded once the number of keys exceeds this threshold" and here we're considering the impact of adding an additional key (so this could have also been written as (numKeys + 1) > growthThreshold).
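A tiny worked example of the off-by-one (numbers are made up; growthThreshold = capacity * 0.5 with capacity = 64, and the map can no longer grow, i.e. canGrowArray is false): with the old `>` guard, an append attempted when numKeys already equals 32 is still allowed, leaving 33 keys, one more than the half of capacity that radix sort relies on; the new `>=` guard rejects it at 32. A sketch:

```java
public class OffByOneSketch {
  public static void main(String[] args) {
    int growthThreshold = 32;   // capacity 64 * 0.5, per the discussion above
    for (int numKeys = 31; numKeys <= 33; numKeys++) {
      boolean oldGuardRejects = numKeys > growthThreshold;   // before this PR
      boolean newGuardRejects = numKeys >= growthThreshold;  // after this PR
      System.out.println("numKeys=" + numKeys
          + " oldGuardRejects=" + oldGuardRejects
          + " newGuardRejects=" + newGuardRejects);
    }
    // At numKeys == 32 the old guard still lets the append through, so the map can
    // end up holding 33 keys when it cannot grow; the new guard stops at 32.
  }
}
```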
Merging into master, 2.1, and 2.0 branches.
## What changes were proposed in this pull request?

Radix sort requires that half of the array is free (as temporary space), so we use 0.5 as the scale factor to make sure that BytesToBytesMap will not hold more items than 1/2 of its capacity. It turned out this is not true: the current implementation of append() could leave one more item than the threshold (1/2 of capacity) in the array, which breaks the requirement of radix sort (failing the assert in 2.2, or failing to insert into InMemorySorter in 2.1).

This PR fixes the off-by-one bug in BytesToBytesMap.

This PR also fixes a bug, introduced by #15722, where the array would never grow if it failed to grow once (staying at the initial capacity).

## How was this patch tested?

Added regression test.

Author: Davies Liu <davies@databricks.com>

Closes #16844 from davies/off_by_one.

(cherry picked from commit 3d0c3af)
Signed-off-by: Davies Liu <davies.liu@gmail.com>
What changes were proposed in this pull request?

Radix sort requires that half of the array is free (as temporary space), so we use 0.5 as the scale factor to make sure that BytesToBytesMap will not hold more items than 1/2 of its capacity. It turned out this is not true: the current implementation of append() could leave one more item than the threshold (1/2 of capacity) in the array, which breaks the requirement of radix sort (failing the assert in 2.2, or failing to insert into InMemorySorter in 2.1).

This PR fixes the off-by-one bug in BytesToBytesMap.

This PR also fixes a bug, introduced by #15722, where the array would never grow if it failed to grow once (staying at the initial capacity).

How was this patch tested?

Added regression test.